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We discuss a method based on sampling minimally entangled typical thermal states (METTS) 
that can simulate finite temperature quantum systems with a computational cost comparable to 
ground state DMRG. Detailed implementations of each step of the method are presented, along 
with efficient algorithms for working with matrix product states and matrix product operators. 
We furthermore explore how properties of METTS can reveal characteristic order and excitations 
of systems and discuss why METTS form an efficient basis for sampling. Finally, we explore the 
extent to which the average entanglement of a METTS ensemble is minimal. 
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I. INTRODUCTION 

According to the elementary principles of quantum sta- 
tistical mechanics, the average of an observable is found 
by computing 

{A)=TT[pA] = ^Tr[e-^"A]. (1) 

While performing such a calculation directly is usually 
intractable for quantum many body systems, one can 
still approximate the expectation value of A by strate- 
gies based on sampling. For instance, many Quantum 
Monte Carlo methods take advantage of the Trotter de- 
composition of p to express an expectation value as a sum 
over virtual classical paths, which can then be numeri- 
cally sampled. In such approaches, the act of randomly 
choosing a path corresponds to sampling both the ther- 
mal and quantum fluctations together. 

However, it is also possible to sample the thermal fluc- 
tuations only, randomly selecting quantum states rather 
than classical ones. This may be done by expanding the 
trace in Eq. ([T]) in terms of an orthonormal basis \i) such 
that the expectation value of A takes the form 

i 
i 

In the first line we have used the cyclic property of the 
trace and in the second line the set of normalized states 
\4>{i)) are defined as 

\m - P{^)-'/'e''"^'\^) (3) 

together with the (unnormalizcd) probability distribu- 
tion 

Pi^) = {^\e-^"\^) . (4) 



One may then estimate (A) by sampling the states 
with probability P{i)/Z and averaging the expectation 
values {(t)(i)\A\(l){i)) computed at each step. 

Though such a procedure could be carried out us- 
ing any orthonormal basis of states not every choice 
would lead to an efficient algorithm. In particular, the 
computational cost of working with a quantum state goes 
up the more entangled it is, as measured by the von 
Neumann entanglement entropy across bipartitions of the 
system. 

Therefore a natural choice for the basis \i) is the set 
of classical product states (CPS). These are states with 
wavefunctions of the form 

V) = \ii)V2)Vz)...\in) . (5) 

where the ij label states in a local basis that may be 
chosen arbitrarily for each site j. As a result of their 
product structure, CPS have an entanglement entropy 
that is exactly zero. 

One therefore might expect that, of all ensembles of 
states !(/>(«)), those produced from CPS have the least 
entanglement. Moreover, such \4>{i)) have all of the prop- 
erties one intuitively expects of typical states of a ther- 
mal system. At high temperature they are effectively 
classical, while at lower temperatures they spontaneously 
break symmetries of the Hamiltonian in systems that ex- 
hibit long range order. They do not exhibit unphysical 
non-local correlations, and remain factorized over parts 
of the system which do not interact. And, expectations 
of observables {(t){i)\A\(f){i)) lie very close to the thermal 
average. For such reasons as well as others that we will 
discuss below, these states have therefore been designated 
minimally entangled typical thermal states, or METTS.I^ 

Not only does each METTS provide a good charac- 
terization of the thermal ensemble, but there exists a 
simple algorithm, the pure state algorithm, for sampling 
many of them efficiently. This is acheived by a random 
walk through the set of METTS where the next state 
is constructed from a CPS obtained by measuring, or 
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collapsing, the previous one. This collapsing step auto- 
matically ensures that the METTS are sampled with the 
correct distribution, and it allows one to optimize the 
the sampling process by an appropriate choice of basis. 
Moreover, collapsing each state may be seen as sampling 
the quantum fluctuations of the system, a fact that can 
be used to accelerate the calculation of certain physical 
observables. 

In this paper, we elaborate upon the implementation 
of the pure state algorithm for sampling METTS and 
its usage in realistic simulations. First, in section [II] we 
provide a more detailed explanation of the pure state 
algorithm and its implementation. We begin with a dis- 
cussion of various methods for carrying out the imagi- 
nary time evolution implicit in Eq. ([s]), as it is the most 
computationally intensive part of the algorithm and the 
step where the most significant approximations must be 
made. Then, we turn to the measurement of physical 
observables. While it is most convenient to calculate 
quantities like the total energy using matrix product op- 
erators (MPOs), we will show how local observables can 
be calculated more efhciently by taking advantage of the 
redundancy inherent in the matrix product state (MPS) 
formalism. We then conclude the section by discussing 
an efficient method for collapsing a pure state into a CPS. 
The basis into which the CPS is measured can be arbi- 
trary, and we will demonstrate how this freedom may be 
exploited both to improve the sampling autocorrelation 
time and to help calculate challenging observables. 



In section III we turn our focus to the properties of the 
METTS themselves. Properties of individual METTS 
can provide insight into the order that is present in a 
system as well as characteristic thermal fluctuations. On 
the other hand, studying the entire METTS ensemble 
will show us why it is such an efficient basis for sam- 
pling. Finally, we quantify the sense in which METTS 
are minimally entangled and conjecture that although 
there exist thermal decompositions with less entangle- 
ment than METTS at some temperatures, no decompo- 
sition can outperform an optimal METTS ensemble at 
all temperatures. 

Throughout, whenever we find it helpful to show the 
results of an explicit calculation, we will choose as our 
model of interest the S = 1 bilinear-biquadratic chain 
with Hamiltonian 



(6) 



In particular we will focus on two special cases of this 
model, namely 9 — 0, the 5 = 1 Heisenberg antiferro- 
magnet and 6 — arctan[l/3], the AKLT model posessing 
an 5 = 1 valence bond solid ground state.l^ At T — 0, 
both models are in the same phase, the Haldane phase, 
which is characterized by, among other things, a gap to 
all excitation^ and a doubly degenerate entanglement 



II. PRODUCING METTS WITH THE PURE 
STATE METHOD 

The pure state method consists of the following simple 
algorithm for producing a series of METTS such that 
they are correctly distributed with probability P{i)/Z: 

1. Choose a CPS 

2. Compute the METTS \<p{i)) = e~'^"/^\i)P{i)-^/^ 
and calculate observables of interest. 

3. Collapse a new CPS from with probability 
p{i — ?► i') = and return to step 2. 

We will shortly discuss how to carry out these steps in 
practice. However, let us first see why producing METTS 
in this way is guaranteed to give the correct distribution. 

Consider the ensemble of all CPS \i) initially dis- 
tributed with probability P{i) / 2. If we randomly choose 
a CPS |z) from this ensemble and then follow the steps 
above, the probability of measuring a particular CPS \ j) 
at the end of step 3 will be 



i i 
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That the resulting ensemble of all such |j) have the same 
distribution as the original ensemble is a consequence of 
the detailed balance condition 



P{^) 



Pii j) = 



p{j i) 



(8) 



spectrum.^ 



Therefore the ensemble of CPS with distribution P{i)/ Z 
is a fixed point of this process. Finally then, because 
each METTS | is produced deterministically from a 
CPS \i) in step 2, they will be generated with probability 
P{i)/ Z as well. 

In addition to generating METTS with the correct dis- 
tribution, the pure state method has other important 
properties that make it advantageous for performing sim- 
ulations. First, it may be defined in a completely general 
way, allowing one to choose a specific implementation 
based on the problem of interest. Next, it does not rely 
on a rejection method to perform sampling. That is, ev- 
ery METTS produced (in what is the costliest step of the 
algorithm) can be used to perform measurements. And, 
the algorithm is readily parallelizable since its sampling 
method is that of a Markovian random walk. 

Most importantly, the computational cost of the pure 
state algorithm is only modestly greater than that of 
ground state DMRG. The typical bond dimension m of 
the matrix product state representing each METTS pro- 
duced by the imaginary time evolution step ranges from 
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FIG. 1: Low temperature measurements of the (a) suscepti- 
bility and (b) specific heat of the S = 1 Heisenberg chain with 
iV = 100 sites. 



m = 1 for j3 = to m = mg for very large /3, where toq is 
the bond dimension necessary to represent the ground 
state using DMRG. And, as shown below, the imagi- 
nary time evolution for a system with TV sites scales as 
Nm^ per timestep. Therefore the cost of producing each 
METTS scales as Nm^P (ground state DMRG scales as 
Nm^). The CPS collapse step is much less costly, scaling 
only as as Nm? . 

We have therefore been able to reach significantly lower 
temperatures with the METTS approach than with pre- 
vious DMRG-based finite temperature methods. For ex- 
ample, Feiguin and White were able to simulate the 5=1 
Heisenberg chain down to about T — 0.05 using the an- 
cilla method, where about m = 500 DMRG states had to 
be kept at the lowest temperatures when using a trunca- 
tion error cutoff of 1x10""'^'^. This is because their matrix 
product states had to encode both the system and envi- 
ronment spins P 

In contrast, to reach comparable temperatures using 
METTS requires only about m = 60 states when using 
the same cutoff. This has allowed us to produce accurate 
results down to at least T = 0.02, as shown in Fig. [l] 



And, since the computational cost of METTS is com- 
parable to that of ground state DMRG, we expect that it 
can treat comparably large ladders and two-dimensional 
systems. This promises to be of great value for study- 
ing frustrated and fermionic models beyond one dimen- 
sion — especially models with non-trivial phases at finite 
temperature. 

Having defined the pure state method in a general set- 
ting, let us look more closely at each step. We will dis- 
cuss how to produce and sample METTS in detail and 
demonstrate that there is great flexibility within the ba- 
sic algorithm that can allow us to optimize it for our 
system of interest. 



A. Imaginary Time Evolution 

Since the imaginary time evolution is both the costliest 
step of the algorithm and the source of the most numer- 
ical error, we will begin by discussing some of the best 
methods for evolving a state in imaginary time and their 
relative advantages and disadvantages with regards to 
producing METTS. 

In what follows, we will find it convenient to repre- 
sent wavefunctions as matrix product states (MPS) and, 
in certain cases, operators as matrix product operators 
(MPO). A brief review of this formalism may be found 
in the Appendix. 



1. Trotter- Suzuki Method 

For a one dimensional system, the simplest way to im- 
plement time evolution is to use a Trotter-Suzuki decom- 
position of the time evolution operator, evolving each 
bond one at a time. When using this method for our 
ID benchmark calculations, we chose to do a second- 
order breakup as in the time-dependent DRMG method 
of White and Feiguin^. Assuming that the Hamiltonian 
consists of a sum of nearest-neighbor bond hamiltonians 
H = Hn, decompose e"'^^ as 

We emphasize here that the labeling of the terms in 
the Hamiltonian is arbitrary. However, for a ID system 
with nearest-neighbor interactions it is convenient to let 
Hn refer to the interactions on the n"" bond, making the 
operator decomposition above perfectly suited for a single 
DMRG sweep. Because each factor, or 'gate', in this 
breakup is a local operator, it can be computed exactly 
such that the only error in this treatment of the time 
evolution operator is the finite time step Trotter error. 
Then, each gate may be applied directly to the MPS 
representing the state of the system by using DMRG to 
expose the two sites around the bond to be time evolved. 

While the Trotter-Suzuki method is accurate and con- 
venient, it is only directly applicable to systems with 
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FIG. 2: Diagrams illustrating (a) the structure of a swap gate 
tensor, which is a product of two identity tensors and ex- 
changes the states of two lattice sites and (b) the application 
of swap gates to an MPS, allowing a Trotter time evolution 
gate to act on effective nearest-neighbor sites while actually 
evolving a system with longer range interactions. 



nearest-neighbor interactions along the path taken by the 
MPS through the sites of the lattice. This makes it hard 
to apply to 2D systems, including those with nearest- 
neighbor interactions. Even quasi-lD ladders and ID 
chains with longer range interactions cannot be treated 
by the above method without some modifications. 

Various methods have therefore been developed to ex- 
tend Trotter time evolution to such systems, for example 
Feiguin and White's method based on a Runge-Kutta dif- 
ferential equation solver.'^ However, such methods turn 
out to be too inefficient for our purposes. So, we will 
discuss some alternative methods below that are more 
effective for these challenging systems. 



2. Trotter- Suzuki with Swap Gates 

The first method extends the Trotter-Suzuki method 
by utilizing swap gates, a tool that is familiar in quan- 
tum computing. A swap gate S{ij) exchanges the states 
on two identical sites i and j. For a spin or boson sys- 
tem, S{ij) = dg s'.^s s'. ; the diagrammatic representation 
of which is shown in Fig. [2p,. Swap gates can be used 
for fermionic systems as well. In the case of spinless 
fermions, for instance, S{ij) takes the same form as for 
bosons, but with an overall minus sign multiplying the 
case in which both sites are occupied. 

Now, a generic Trotter-Suzuki breakup of the time evo- 
lution operator for a Hamiltonian containing only one site 
or two site interaction terms can be written in the form 



K K K 



(10) 



were the sites acted on by a given gate Kij are not nec- 
essarily nearest-neighbors along the path taken by the 
MPS through the lattice. However, Kij can be written 
in terms of a gate that does act on neighboring sites by 
using the identity 



(11) 



where the swap operators Q can be defined as a product 
of nearest-neighbor swap gates as 



(12) 
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The operator Kj_[j is the exponentiation of the same 
two-body term in the Hamiltonian as Kij, but acts in- 
stead on the sites j — I and j. 

To make use of the identity above for the purpose of 
applying the single gate Kij, one first applies the swap 
gates composing Qj^n to the original MPS |V'), yielding 
an effective MPS Then, the gate K^^llj may be 

acted on \tp') as a local operator. Finally, the swap gates 
composing are applied, restoring the original order 

of the sites. . 

Finally, when applying many Trotter gates Kij to an 
MPS, it is useful to observe that because S{ij)^ = 1, the 
ordering of the Trotter gates in the decomposition can be 
chosen such that many of the swap gates cancel. When- 
ever this is the case, the repeated pair of gates can simply 
be omitted from the algorithm, leading to a significant 
speedup. 

One way to arrange the Trotter decomposition to take 
advantage of this cancellation is as follows: taking j > i, 
order the Kij first by i, from left to right and then for 
each i, by j from left to right. Then all of the gates are 
applied in a single sweep that is arranged as a nested 
loop, where i is iterated over in the outer loop and j 
in the inner loop. In practice, all of the gates can be 
produced before applying any of them to the MPS so 
that cancellations can be found and the corresponding 
pairs of gates omitted. Even when taking advantage of 
cancellations, however, this method scales as LxLi^m^ for 
an Lx X Ly ladder instead of scaling proportionally to the 
number of sites as in the case of a chain. 
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FIG. 3: Diagrams illustrating (a) the left orthogonality con- 
dition Eq. ( 13 1 and (b) the right orthogonality condition 
Eq. p|. 



3. Imaginary Time Evolution with MPOs 

The remaining methods for time evolution we wiU dis- 
cuss are based upon matrix product operators, or MPOs. 
But, before we turn to the formation of the time evolu- 
tion MPO itself, let us first consider how to apply it to 
an MPS. 

A concept that will be useful is that of the orthogonal- 
ity center of an MPS. If we take the center to be at site 
i, then all of the MPS matrices A''^ such that j < i have 
been made left orthogonal in the sense that 

E (^^Ol^„,_,^a;„,a,=<5„;., (13) 

and all of the matrices A^^ such that j > i have been 
made right orthogonal such that 

V „ (A"^)^ , =Sa- ,a' . (14) 

Sj aj 

These conditions are illustrated in Fig. |3] 

When an MPS has a well defined orthogonality center, 
the set of A matrices to its left and right define an or- 
thonormal basis for each half of the system. The matrix 
at the orthogonality center then defines the coefficients 
of the wavefunction in this basis, that is 

\^)= E ^Z-.aM-i)h)\a^') (15) 

Siai-iOii 
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FIG. 4: The fitting algorithm for multiplying an MPO times 
an MPS. First, the tensors (a) L and R are formed by com- 
bining \iPa), the MPO W acting on it and the resulting state 
ji/'s) moving from the edges of the system toward the orthog- 
onality center. Then, tpB is defined by the tensor in (b) and 
split via SVD into matrices Bi and Bi+i. Finally, the orthog- 
onality center of |V's) is shifted and the process is repeated 
until convergence. 



where the site index — 1,2 ... ,d while the bond indices 
aj = 1,2 ... ,171. Alternatively, when performing two-site 
operations one can can take the orthogonality center to 
be a bond that can optionally include the sites adjacent 
to it. 

When performing the usual SVD or density ma- 
trix diagonalization in DMRG, one takes the tensor 
ip{ai-iSiSi+iai+i) representing the wavefunction and 
splits it into matrices A'^^ and A^'+'^ . These matrices 
give an optimal represention of the tensor ip and because 
they define expansion coefficients of the wavefunction in 
an orthonormal basis, the truncation of the wavefunction 
is also optimal. If there was no orthogonality center, or 
an SVD was performed away from the center, such an 
operation would optimally represent a local tensor but 
not an optimal truncation of the wavefunction. 

In applying an MPO W to an MPS \iPa) one destroys 
its orthogonality. Apart from issues of efficiency, the sim- 
plest approach to forming the product M^IV'a) would be 
as follows. Define the matrices B as 

= ^(%_ia._i)(".".) " E (16) 

for each site i. If the bond dimension of VF is fc and 
I^a) is TO, this defines a new MPS IV'b) = VF|V'a) of 
bond dimension mk if we think of the /?,; — (w^Ofi) as 
combined, or fat, indices. 



To guarantee proper truncation in this naive approach, 
one would first sweep left to right, performing SVDs to 
make the basis to the left orthogonal, but leaving the 
bond dimension as mk since the basis to the right is not 
orthogonal. Once at the right edge, the SVDs in the re- 
verse sweep could truncate the bond dimension to m or 
to some specified truncation error. However, this proce- 
dure scales as Nm^k^ and would be highly inefficient if 
fc - 10- 100. 

Verstraete and Cirac recommended a better approach 
(in a different context) that scales as Nm^k + iVm^fc^ 
based on fitting an MPS \iPb) to the product W\i1^a)^ 
Form a random MPS IV's) of bond dimension to and or- 
thogonalize it to have any arbitrary orthogonality center. 
The optimal two-site wavefunction ■(/'(/Si-iSiSi+i/Ji+i) at 
this center is then found by forming tensors L and R rep- 
resenting the product in the basis for the left and right 
halves of the system and combining them with the lo- 
cal W and A matrices as in Fig. [4] This iIjb minimizes 
II IV'b) — W^IV'a)IP- One may then split i/'s into matri- 
ces i?*' and and sweep back and forth through the 
system, repeating this process until it converges. 

Verstraete and Cirac originally advocated a single-site 
version of this approach but we have found the two-site 
version better — for example, the value of to can be auto- 
matically set to achieve a specified truncation error much 
more readily, particularly if one is keeping track of con- 
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FIG. 5: The tensors (a) Ci and (b) C2 generated in the first 
two steps of the zip-up method for multiplying an MPO times 
an MPS. Indices grouped by parentheses are to be thought of 
as a single fat index for the purpose of computing a Singular 
Value Decomposition. 

served quantum numbers. However, even when using the 
two-site version, this fitting procedure may fail to con- 
verge for systems with non- nearest-neighbor interactions 
in a reasonable number of iterations. The reason seems 
to be related to the "sticking" problem of DMRG. When 
two distant sites are strongly entangled, the method has 
trouble building up this entanglement when improving 
the state near each site one at a time. On the other hand, 
the naive method above does not have such a sticking 
problem. 

We have therefore developed an alternative approach, 
the zip-up algorithm, that has similar scaling to the fit- 
ting method but which does not get stuck. We begin 
by identifying the MPO W with an MPS \W) by com- 
bining its primed and unprimcd (i.e. bra and ket) site 
indices. The orthogonality centers of \iPa) and \W) are 
then moved to the first site. This arrangement guaran- 
tees that the product of W and lip) at the first site has a 
basis to the right that, while not orthogonal, is not dras- 
tically ill-conditioned. By this we mean that while some 
linear dependence of the basis can exist, no basis state is 
expected to have a norm that is significantly bigger than 
one (norms much less than one are less of a problem). 

Then, starting from the first site, we form the product 
MPS I'i/'s) as follows. Define the tensor Ci as 

Thinking of Ci as a matrix Cg-^ (^i^-^ai)j where (wiai) is 



a combined or 'fat' index, one may compute its Singular 
Value Decomposition (SVD) as 

Csi (^lai) = Us,(S,A^,V^^ . (18) 

At this step, it is crucial to perform a truncation, keeping 
only the m largest singular values contained in Ai. This 
may be done by setting m explicitly or by specifying a 
cutoff on the truncation error. After truncating, the first 
matrix of \iJjb) is defined as -B^^ = Us^^p-^ and is therefore 
left orthogonal by construction. 

The remaining B matrices may now be found recur- 
sively. To go on to the next step, one computes 

^/3iW2a2 ^ $Z Ujj^^,^Cui\aiWuij^J2^aia2 ■ (19) 
s'^ ,uJi ,ai 

this time thinking of the result as a matrix C'(^iS2)(i<J2a2) 
and computing its SVD to find C^(^iS2)(92- T^^^ matrix 
B'^^ may then be set to -B^^^^ ~ ^(;3iS2)ft ^^'^ the process 
continued by defining C3 — C'^^tJsQs analagously in terms 
of U2, C2, A3 and W3. 

For clarity, the construction of the C tensors is de- 
picted graphically in Fig. [5j Asymptotically, if one as- 
sumes that k < m, this algorithm scales as Nm?kd. 

We note that when performing the SVDs in the above 
procedure, it is important to use a truncation error cut- 
off versus a fixed m during the initial left to right sweep 
to compensate for the modest loss of efficiency stem- 
ming from the fact that each truncation optimizes a local 
tensor, not itself. Once the computation of IV's) 

reaches the right boundary, the return SVD is fully effi- 
cient since the B matrices are left orthogonal and during 
this return sweep m can be reduced to an optimum value. 
For typical 2d clusters, we find that such a procedure 
leads to an intermediate m that is about twice m. 

Finally, it may be useful to combine the above two 
approaches. One could perform an initial left to right 
sweep using the zip-up algorithm with a fixed m ~ m 
to obtain an initial guess for \iI^b) and then switch to 
the fitting algorithm for the return sweep (and possibly 
additional sweeps) until convergence is reached. 

4. Taylor Series Construction of the Time Evolution MPO 

Having discussed the application of an MPO to an 
MPS, let us now turn to the construction of an MPO rep- 
resentation of the time evolution operator Kt = e~'^^ . 
The first method we will discuss relies on expanding Kr 
in a Taylor series. 

To construct the MPO for Kr using this approach, 
one begins by constructing an MPO representation of 
the Hamiltonian H. While it is sometimes possible to 
work out the MPO for H explicity, such as in the case of 
translationally invariant ID systems, more complex sys- 
tems can be treated by constructing MPOs for each term 
in the Hamiltonian and summing them together (see the 
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Appendix for further details on adding and multiplying 
MPOs). 

Having found an MPO representation of H , the Tay- 
lor series approach begins with breaking the timestep r 
into small fractions e = r/n. One may then approximate 
Kf^ ~ e^'^^ in a Taylor series which is truncated at a high 
order p such that the error is controlled by e^. Finally, 
this series may be summed by utilizing MPO multiplica- 
tion and addition algorithms. In doing so, it is important 
to arrange the computational steps so as not to repeat 
any expensive calculations such as computing powers of 
H. One such arrangement is given by the following algo- 
rithm. 

Compute the MPO Wq = — eiJ. Then compute the 
MPOs Wj recursively as 

W, = -eH{l + —^^W,^,) (20) 

where j — 1, . . . ,p. The timestep operator itself is then 
given by = 1 4- Wp. 

Finally, the MPO for Kr may be found by computing 
the n"' power of ifg using the zip-up method for MPO 
multiplication described in the Appendix. In doing so, it 
is helpful to choose n to be a power of two so that 
may be computed by by repeatedly squaring K^. In this 
way fewer MPO multiplications are required. 



generally possesses at least one eigenvalue that scales as 
iV, while other eigenvalues scale as with p <1. 

In contrast, the Trotter-Suzuki approach to computing 
K does not suffer from this difficulty since one works with 
gates acting only on pairs of sites, and such gates may 
be constructed exactly. The only sources of error that 
remain, then, are the finite timestep Trotter error and 
the truncation error in the construction of the MPO. 



B. Measurement of Observables 

After producing a METTS, one wants to measure a 
number of bulk quantities, such as the energy, as well as 
local observables, such as the magnetization on each site. 
For bulk observables that consist of local terms summed 
over the entire lattice, it is usually convenient to use an 
MPO representation. We must therefore find an efficient 
way to compute expectation values of such MPOs. 

On the other hand, it is more efficient to work with 
local observables explicitly, and to compute their expec- 
tation values only in terms of the orthogonality center of 
the MPS. And, as we will see later, this second approach 
is particularly important as it is the key to performing 
an efficient CPS collapse in the last step of the pure state 
method. 



5. Trotter- Suzuki Construction of the Time Evolution 
MPO 

An alternative method for constructing the time evolu- 
tion MPO K — e^'^^ is to make use of the Trotter-Suzuki 
decomposition of Eq. ^ , where K is approximated as a 
product of gates Kij. But instead of applying the gates 
directly to the MPS representing the system, one com- 
bines them together to form an MPO for K. 

Now, explicitly writing an MPO representation for a 
gate Ki j can be challenging as it is not a simple product 
of single site operators and will not generally act on sites 
adjacent to each other in the MPS. However, one solu- 
tion is to use the swap gate technique described above. 

(ii) 

One first works out the set of swap gates and local Kj_[ ■ 
gate operators that form K and removes any redundant 
gates. Then the MPO for K may be computed by cre- 
ating an identity MPO and applying the gates directly 
to it, taking care to truncate the MPO bond dimension 
after each gate is applied. Or, if even greater accuracy 
is required, one may follow the steps above to construct 
an MPO for = e~^^ with e = t/2" and then use the La^^^p^ 
zip-up method to square it n times and obtain K. 

An important advantage of this Trotter-Suzuki method 
over the Taylor series approach is that errors are easier 
to identify and control. In particular, we have found that 
the parameters controlling the error in the Taylor series 
method must be reduced whenever the number of lattice 
sites N is increased in order to maintain the same accu- 
racy. This can be traced to the fact that the Hamiltonian 



1. Computing the Expectation Value of an MPO 

Operators formed by summing over local terms, most 
notably the Hamiltonian, can generally be written as 
an MPO with relatively small bond dimension k. Since 
we will need to compute the expectation value of these 
MPOs with respect to MPS that have a relatively large 
bond dimension to, let us see how to do this efficiently. 

Consider computing the matrix element {iI}a\W\4'b) 
where W is an MPO of bond dimension k, and \^a) and 
\4'b) are MPS of bond dimension to. Of course, this 
reduces to an expectation value if \4'a) = IV's)- 

One starts the computation at the beginning of the 
chain, defining Li as 



(21) 



siti 



Then L2 can be produced from Li by computing 



E 



E 



CX.\CX.2 



B 



(22) 



Here the order of operations in performing the sum is 
crucial to ensure good scaling - in particular one should 
not sum over both the ai and /3i indices simultaneously. 

One then proceeds to define L3 in terms of L2 in a 
similar manner until the end of the chain is reached. In 
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this way, one computes the matrix element exactly as 
there is no truncation step. If the steps of the algorithm 
are followed in the order given above and we assume that 
k < m, the computation scales as Nm?kd. 



2. Computing the Expectation Value of a Local Operator 

While any operator may applied to a state by repre- 
senting it as an MPO, it is often more efficient to compute 
the action of local operators on an MPS explicitly. The 
basic idea is shift the orthogonality center of the MPS 
(defined in section |II A 3 above) such that it lies within 



the set of sites on which the local operator acts. This way 
any MPS matrix on which the operator acts trivially is 
excluded from the computation. 

For example, consider computing of the expectation 
value of a single site operator Oi acting on site i. If 
we take the orthogonality center to be the site i, then 
because the matrices A"^ for all j < i obey the left or- 
thogonality condition Eq. ( 13 1 and the matrices A"^ for 



all j > i obey the right orthogonality condition Eq. (14 1, 
the expectation value of Oi reduces to 



(23) 



Or, if we are interested in applying the operator to the 
state to obtain OiltpA), this can be done by making the 
replacement 



Ai 



(24) 



while keeping all other A matrices fixed. 

Expectation values of multi-site local operators such 
as {ipAlOiOi+ilipA} may be calculated in a similar way. 
Assuming that the orthogonality center is at site i or 
i + one now includes both the matrices A^' and A'^'+i. 
However, in contrast to the single-site computation, it is 
important in such multi-site expectation values to break 
up the tensor contractions such that the scaling of any 
one operation never exceeds m^d. 

Finally, while each local expectation value may be 
computed with a scaling no worse than m^d, one usu- 
ally wants to perform such a calculation at each site in 
the system. Since this necessarily involved shifting of 
the orthogonality center, the overall scaling will then be 
Nm^d"^ as an SVD must be computed at each step. 



This is because correlations exist both between sequen- 
tial METTS and between related observables within a 
METTS. 

As we will discuss in the next section, the autocorrela- 
tion time of the METTS pure state method can be made 
quite short (less than 5 steps or so) by properly choosing 
the basis into which the CPS are collapsed. However, 
when estimating the error in a series of expectation val- 
ues, it is still best to remove the effects of correlations by 
using a binning procedure as follows. 

After calculating expectation values of an observable 
A, one collects sequential values into bins whose size is 
larger than the autocorrelation time. The average value 
within each bin is then computed. Each bin average may 
now be taken to be statistically independent and the error 
estimated as the standard error of these bin averages. 

However, for quantities such as the specific heat or 
magnetic susceptibility, given by Cy — ^[(iJ^) — (H)'^] 

and X = N [i*^^) ~ ("^)^]) computing the error as the sum 
of the errors in the first and second moments greatly 
overestimates the true error even when using binning. 
This can be traced to the fact that, from one step of the 
pure state algorithm to the next, the first and second 
moments of a bulk observable A are strongly correlated. 
That is, if (A) is relatively large for a particular METTS, 
so is {A"^). 

A simple way to deal with this difficulty is therefore to 
use a resampling method, such as the bootstrap method 
but with correlations taken into account. In other words, 
from the entire set of A expectation values, one randomly 
samples a new set of the same size as the original but 
with replacement, such that a value may appear more 
than once. One then calculates (^4) for this resample. 
But then, instead of repeating this procedure indepen- 
dently for {A"^), one resamples instead the same subset 
of METTS. So, if expectation value of A calculated from 
a particular METTS is included in the resample n times, 
the corresponding expectation value of A^ is included n 
times as well. 

The bootstrap method then consists of repeating the 
above process a large number of times, each time obtain- 
ing an estimate for the second cumulant (A^) — {A}'^. If 
the process is repeated enough times, the resulting distri- 
bution of second cumulant estimates should be approxi- 
mately Gaussian. Finally, the error in the second cumu- 
lant may be estimated as the standard deviation of this 
Gaussian distribution. 



Collapsing METTS into Classical Product 
States 



3. Estimation of Errors 

While the methods discussed above give an efficient 
way to calculate observables for a single METTS, one 
must also consider the properties of the METTS ensem- 
ble when estimating averages and computing error bars. 



After generating a METTS by imaginary time evolu- 
tion and calculating observables of interest, the METTS 
must be collapsed into a CPS so that a new METTS can 
be generated with the correct probability. This can be 
done in any basis, a fact which one can exploit to reduce 
the autocorrelation time of the pure state algorithm and 
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to accelerate the measurement of certain observables. 

It is also important, for reasons of efficiency, to carry 
out this collapse for each site one at a time. We will 
therefore discuss how to collapse a METTS in practice 
and show that the resulting algorithm scales as Nrn^cP. 



be written explicitly as 
P,(0) = l-{h-S,f 



(28) 



1. CPS Collapse Algorithm 

Since collapsing METTS is simplest for the case of a 
spin 1/2 model, let us begin by considering this case be- 
fore discussing the general method. 

Say one wishes to collapse the spin of a pure state \^) 
at site i along the n axis. Define the states 1+) and |— ) 
at site i such that h ■ Si\±) = ±i|±). Then the proba- 
bility of finding the spin at site i to be in the |-|-) state 
is p+ = {^\fi ■ Si\^) + \ and to be in the |— ) state is 
p- = 1 -p+. 

After determining the new state for site i using a ran- 
dom number generator, the actual collapse is enacted by 
the projection 



-1/2 
P+ ' 

-1/2 
P- 



-){- 
-){- 



l^*) prob 
1^') prob p_ 



(25) 



Finally, once this procedure is carried out for every site 
in the system, one obtains a CPS j^*') with probability 

Now consider the more general case of a lattice system 
where the local Hilbert space at each site i can be ex- 
panded in an orthonormal basis \m)i with m = 1, 
We again emphasize that this basis can be chosen arbi- 
trarily for each site i and for each step of the algorithm. 

Define Pi{m) = \m)i{m\i as the projection operator 
into the state |m)j. Then the probability of finding site 
i to be in the state \m)i is just Pm = {'^\Pi{m)\'i') . 

Having selecting a particular state |m)i according to 
this probability, one completes the collapse by making 
the replacement 



I*) 



P 



-1/2 



(26) 



Before discussing the detailed implementation of this 
method for MPS, let us conclude with some explicit ex- 
amples of projection operators. From the above discus- 
sion, we can immediately identify the projectors for the 
5 = 1/2 basis states and |— ) as 



fi ■ S, 



(27) 



Or, if we consider an S" = 1 system and take \m)i with 
TO = 1, 0, 1 to be the eigenstates of n • 5, the Pi{m) may 



2. Collapsing an MPS Into a CPS 

Having defined the CPS collapse algorithm in a gen- 
eral setting, let us see how to apply it to a state \tpA) 
represented as an MPS. First, assume that the orthogo- 
nality center of the MPS has been chosen to be the first 
site. In other words, assume that the matrices A'^^ such 
that j — 2,3, ... ,N have been made right orthogonal as 
in Eq. 



One begins the collapse by first computing the expec- 
tation values of the projectors Pi(to). As discussed in 
section [II B[ the existence of a well defined orthogonality 
center allows us to compute these expectation values in 
terms of the matrix A'^^ alone. Explicitly, one computes 

(^^|Pi(to)|^^) = iA^'^)l{^'l\Plim)\s^)AZ 



E 



J2iA^'^)iMH 



(29) 



Once the new state of site 1 is chosen to be |si) = |toi), 
say, with probability pmi = ('0^|^'i(wi)|'0a), the state 
could be collapsed by naively applying the projector 
Pii'fTT-i) to it. However, while the action of a generic 
single-site operator would be followed by an SVD to or- 
thogonalize site 1 and turn site 2 into the center site, 
the fact that the CPS collapse operators are projectors 
allows for an even more efficient algorithm. 

Because the repeated action of single-site projectors on 
any state produces a product state and a product state 
can be represented as an MPS with bond dimension 1, 
if the above step of acting a projector Pi(toi) on site 
1 and left orthogonalizing A'^^ were performed, the new 
bond index a'l connecting to yl*^ could be truncated 
such that it only takes on one value. Anticipating this 
result, one can therefore directly replace A'^'^ with the 
1x1 matrix A^^ — (si|toi). Finally then, to ensure that 
Pi{mi)\tpA) still describes the same state of the remain- 
ing sites not acted on by the projector, one must replace 
according to 



SiQfi 



[ni,\s{)Al\Al\^^ 



(30) 



That is, the label toi plays the role of a new bond index 
that runs over just one value. In fact, because it plays 
a trivial role, this label can be dropped with the result 
that the new state of the system is a product of a single- 
site wavefunction for site 1 and an — 1 site MPS for 
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FIG. 6: The energy of the L = 50, S = 1 Heisenberg chain 
at P — 1.0 found by choosing a completely random CPS (like 
those produced in a /3 = simulation) and measuring the 
energy of a METTS produced after a given number of steps 
of the pure state method. The symbols show the ensemble 
average of many such runs each starting from a different CPS 
(the line is the QMC result treated here as exact). Depend- 
ing upon the basis chosen for the CPS collapse, the number 
of steps needed for the effects of the initial CPS choice to 
be removed varies widely. Measuring only along the z axis 
produces the strongest autocorrelation effects while choosing 
random axes or mutually mixed bases gives much better per- 
formance and an autocorrelation time of less than 4-5 steps. 



the remaining sites. Having made this identification, the 
second site now lies on the boundary of an MPS and can 
be collapsed in exactly the same way as the first. 

Although applying a local operator typically scales as 
m^(f per site, the considerations above show that the 
CPS collapse step of the pure state algorithm can be 
optimized even further. The result is an algorithm that 
scales as m^(P where the costliest step is the formation 
of the new A matrix in Ecyi. 



(30) 



3. Choosing the CPS Basis for Spin Models 

When collapsing a METTS into a CPS, one has the 
freedom to choose the CPS basis arbitrarily for each 
METTS and at each site. We would therefore like to 
exploit this freedom first of all to ensure ergodicity and 
also to minimize correlations among the METTS. 

One simple strategy for improving ergodicity is to pick 
a random quantization axis fi for each site at every step, 
and collapse the spins into the basis of h ■ S eigenstates. 
Then if the Hamiltonian conserves a quantity such as 
the total z magnetization S^^^, such random measure- 
ments will generate CPS which explore many different 
S'fot sectors. Moreover, random projections do not ex- 
plicitly break rotational symmetry in the sense of favor- 
ing a particular axis from the outset. And, we have found 
that in practice random projections work quite well and 



give a short autocorrelation time as shown in Fig. [6] 

In contrast, one could instead collapse every site along 
the same axis, say z. This has a number of advantages, 
including being simpler to code and making measure- 
ments of diagonal operators easier to perform, as we will 
discuss below. One particularly nice feature of using a z- 
only basis is that, since all of the collapsed spins lie along 
the same axis, the resulting CPS can be easily visualized 
and plotted. 

For example, in Figs. [7] and |8] we show the CPS ob- 
tained after collapsing subsequent METTS in simulations 
of the Heisenberg and AKLT models and mark points at 
which the spins fail to follow the diluted Neel pattern 
that underlies the string order in the Haldane phase. It 
is interesting to see the number of defects decrease as 
the temperature is lowered, and to observe that for the 
AKLT case, they disappear entirely as T — ?> 0. 

A serious drawback to the z-only approach, however, is 
that it leads to problems with ergodicity. In practice, us- 
ing the z basis only tends to lengthen the autocorrelation 
time, reducing the efficiency of the algorithm for calcu- 
lating thermal averages. For example, one can clearly see 
strong correlations in the positions of the defects plotted 
in Figs [7^ and [8^. And, correlations in energy measure- 
ments can persist over many steps as shown in Fig. |6] 

Since we would like to retain the advantages of the z- 
only basis but cannot afford to sacrifice ergodicity, one 
solution is to collapse METTS along the z axis only on 
even numbered steps. If we then collapse along random 
axes after odd steps, we can maintain a short autocorre- 
lation time. 

This compromise suggests an even better approach, 
however. Instead of switching to random axis CPS col- 
lapses, we can use a basis that is maximally mixed rel- 
ative to the eigenstates. For example, in a S" = 1/2 
system one would perform odd step collapses along the 
X axis. As eigenstates are in an equal superposition 
of S"^ eigenstates, the autocorrelation time is expected to 
be minimal since any memory of the previous CPS will 
be completely erased. 

Likewise, for 5 = 1 systems, one can perform odd step 
collapses in the maximally mixed basis given by 



1 



V3 
1 

w 

1 



i2ir/3| 



|0> 



|m2> = ^(|i)-|0)-|T)) 

IM3 



-i2ir/3 



|i)) 



-i2ir/3| 



1 - -e 



i2ir/3| 



1)) 



(31) 



As shown in Fig. |6j using this basis on odd steps reduces 
correlation effects even more effectively than the random 
axis approach, for which the autocorrelation time is al- 
ready quite short. 
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FIG. 7: Classical product states produced by performing 
z-only measurements of METTS for the 5=1 Heisenberg 
antiferromagnet with N = 100. Shown are the middle 60 
sites for (a) /3 = 1.0, (b) (3 = 5.0 and (c) /3 = 20.0. The 
vertical red lines mark defects in the diluted Neel order of the 
Haldane phase. 



4- Diagonal Measurements with Product States 

While the cost of computing expectation values of 
quantities such as the energy or the magnetization is 
quite low for each METTS produced, other measure- 
ments can be very costly. 

For example, we may want to measure real space spin 
correlations C(A) = ^^iSi ■ S^+a)- One approach would 
be to create an MPO that encodes the sum over all pairs 
of spin operators separated by A. Or, we can work with 
MPOs for the individual spin operators and perform the 
sum above for each METTS. In practice however, we 
have found that both approaches are very costly because 
of both the time needed to construct the MPOs and to 
compute the expectation values. 

But, assuming we are studying a spin rotationally in- 
variant system, only the {SfS^) correlations are needed 
as the others are equal by symmetry. And, since Sj is 



(a) 

oT[rioooTiToi|ir-lT4.To|nT4,Tio[i|iToo^o4,TiTo4.Ti|4.nT[nriToionoTiro|T 
oToooo|ToiToi|iTiT4.To|nT-lTio[l|iToo^ooioTo4,Ti|j.nT[rooiToiooooTiTo|T 
oTinTo|TiToi[lTiTiTo|TJ.T4,Tio^|iToo^ooioTo[r4,[l[iToo[rooioTiooooTiTo[r 
oTiTiTo^iToi[inTioT|TJ.T-mo^ooTiToooiTo[ri[l[iTo[riToionTJ.ooTiro|T 
oTiT-iTo^ioTi|iTiT4.oT|nT4-Toi[ioonToooiTo[r4.|i|iTo[rioro4,TinoToiro|T 
oTiTiTo[roiTi[lTiT-loT|TiToioo|iooTiTooioTo[r4,[loooToooo4.TiTiooooTo|T 

(b) 

TiT|iToito|iTiToioToinooToioTioro4,oTiTooiTinTooiTmnoTionT 

Tio|DToionTi|ToiooooonTinnonnTioToooinoTioTo4.TinTooiTio 

oTioT|oiTooo|oinTiTiTo4,oToo|ToiTo4-oTo|TioTomoTUoioToo4,Too|o 

Tiroi|nn|oonnToiTooo4,TitiTo[iTo-i[TiroiToinnootiooonoTitm 
(c) 

TinToiToiTiToioToinooToioTioToioTnoo4,TinTooinTinoTionT 

4-ToiToinnnoonTomTinnnnoTionTiTionnToioonoooonT 

TiootoiotiTiToioooooTiTiTiTioTiTJ-TioToooiTioTioToiTiTiTooiTio 
J-TionTmooTooioTiTooiTmooToiontoioTinTioTioooTrnTiToio 

TiToinnooTiUToiToooiTiTiToiTo-iTiToiToinnoonooonoTinn 

FIG. 8: Classical product states produced by performing 
z-only measurements of METTS for the S = 1 AKLT an- 
tiferromagnet with A'' = 100. The temperatures and symbols 
are identical to those in Fig. [7] 

diagonal in the product basis of 5^ eigenstates, we can 
take advantage of this fact to perform correlator mea- 
surements using the CPS collapsed from each METTS 
instead of using the METTS themselves. As long as the 
operator we wish to calculate is diagonal in the basis of 
the collapsed CPS, we will still obtain the correct thermal 
average. 

So, if one performs z-only collapses as discussed in the 
previous section, the operator content of measuring an 
diagonal observable such as C^(A) = J2i{^i^i+A) ^^^'^ 
be ignored entirely. That is, if we collapse a METTS 
into a CPS j^*) in the z basis such that 

I*) = |si)|s2)|s3)...|siv), (32) 

then we immediately have C^(A) = Si Sj+A- 

One can make even better use of the resources ex- 
pended in producing each METTS by re-collapsing it 
multiple times and measuring each resulting CPS. This 
procedure also produces the correct thermal average for 
diagonal observables and is much more efficient for large 
/3 than collapsing each METTS only once. By using the 
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FIG. 9: Properties of a METTS produced for the 100 site 
S — 1 Heisenberg chain at T = 0.1, central 80 sites. In the 
main plot, the solid lines (red, green and black) show the 3 
components of (S) while the (blue) boxes show \{S)\. The 
entanglement entropy on each bond is shown in the top inset, 
while the expectation value of each Hamiltonian bond term 
is shown in the bottom inset. 



pure state algorithm in this way, one explicitly samples 
not only the thermal fluctuations, but the quantum fluc- 
tuations as well. 

Finally, since the above method relies on performing 
CPS collapses along the z axis, it is susceptible to the 
same ergodicity issues mentioned earlier. However, er- 
godicity can be restored by alternating between z axis 
collapses on even steps and random or maximally mixed 
collapses on odd steps. And, for the S* = 1/2 case, since 
the maximally mixed basis is just the x basis, we can 
perform measurements of quantities like C(A) for rota- 
tionally invariant models on every step by treating the x 
basis as an effective z' basis. 



III. PROPERTIES OF METTS 

While the pure state algorithm is effective for comput- 
ing thermal averages, it also generates an entire wave- 
function at each step that may be thought of as a snap- 
shot of the system. Every METTS it produces can be 
used to simultaneously calculate many different observ- 
ables and look for characteristic fluctuations or short 
range order. 

Studying the properties of METTS can also help us 
to understand why relatively few METTS are needed 
to accurately estimate observables. By identifying the 
METTS ensemble which maximizes this sampling effi- 
ciency for a simple model, we will discover that while the 
entanglement entropy of such an ensemble is not always 
optimal, it may be minimal in a more restricted sense. 




20 40 60 80 

Site Number 

FIG. 10: Properties of a METTS produced for the 100 site 
S = 1 AKLT chain, central 80 sites. The temperature and 
symbols are the same as those in Fig. |9] 

A. Observing Thermal Fluctuations With METTS 

Figures |9] and [TO] each show the properties of a sin- 
gle METTS produced for the spin one Heisenberg and 
AKLT chains, respectively. The temperature is taken to 
be T = 0.1 in the units of Eqn. [§ 

Throughout both chains, antifcrromagnetic spin cor- 
relations are clearly visible but thermal fluctuations are 
also apparent. For example, the AKLT chain in Fig. [lO| 
shows longitudinal fluctuations in the magnitude of {S) 
around site 40. In the inset, one can see that the local 
bond energy is also correspondingly higher at this point 
in the chain. This may be understood as a fluctuation 
away from the spin liquid ground state in which (S) = 0. 

A second type of thermal fluctuation that is visible is 
a twist in the staggered magnetization, such as the one 
near site 30 of the Heisenberg chain of Fig. |9] Near this 
site, the spin length is also shortened and the local bond 
energy is relatively low. After the twist, a second region 
of correlated spins persists for a few correlation lengths 
until another twist occurs near site 70. 

It is interesting to observe that since each METTS is 
a pure state, the von Neumann entanglement entropy 
for bipartitions of the system about each bond is well 
defined, and we normalize it here such that a spin 1/2 
singlet has an entropy of 1. As Figs . [9| and \lO\ show , in the 
regions where the spins are more classical and the bond 
energy is relatively higher, the entanglement entropy is 
lower, signifying weaker quantum fluctuations. 



B. Energy Measurements and the Efficiency of 
METTS 

Let us now turn our attention from the properties of 
individual METTS to ensembles of METTS. An impor- 
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FIG. 11: The specific heat of the S = 1 Heisenberg chain 
with N = 100 sites calculated using METTS with each CPS 
collapse performed along random axes. Also shown is the 
quantum specific heat, that is, the average estimate of the 
specific heat as calculated from a single METTS. 



tant concept will be that of a thermal decomposition^ by 
which we mean a set of pure states \4>{i)) and weights 
P{i) such that 



J2p{^)\mm)\ 



(33) 



In fact, all density matrices can be decomposed in this 
way and the \(t>{i)) need not be orthogonal, though they 
should be normalized. From Eqs. ([3| and Q, one sees 
that every METTS ensemble corresponds to such a de- 
composition. 



Implicit in Eq. ( 33 ) is a division of thermal effects into 



classical fluctuations quantified by the weights P{i), and 
quantum fluctuations present within the states !</>(«)), al- 
though this distinction is somewhat arbitrary as there ex- 
ist many different thermal decompositions. But, because 
we are interested in sampling the \<t>{i)) rather than cal- 
culating them all, our sampling process will be especially 
efficient when most of the thermal effects can be cap- 
tured within the states themselves, rather than in their 
distribution. 

To make this intuition more precise, let us consider 
the energy fluctuations within a given thermal decompo- 
sition. Using the notation {A)i = we may 
write the specific heat as 



N 



E 



p(i) 



E 



H) 



(34) 



However, the fact that we are working with an ensemble 
of pure states also allows us to define a 'quantum specific 
heat' 



(7(9) = 



N 



z 



(35) 



that quantifies how much each state |0(i)) contributes to 
the full specific heat on average. 

Now it turns out that, up to a constant factor, the dif- 
ference between the quantum specific heat and the total 
specific heat is nothing but the variance of the expecta- 
tion value of the energy within the ensemble, that is 



N 
N 



(36) 



E 



E 



P(^) 



H) 



So, the smaller this difference, the fewer the number of 
states that must be sampled to estimate the energy to a 
given accuracy. 

Now turning to a particular choice of a thermal decom- 
position, let us first consider the exact energy eigenstates 
of the system, taking |^(i)) = je^) and P{i) = e~^'^^. 
Since H is diagonal in this basis, it follows that 
{H'^)i = {H)i- Therefore the quantum specific heat is 
precisely zero and, as a result, the variance in the en- 
ergy is maximal. So even if one could sample the exact 
energy eigenstates of the system, they would make an 
exceedingly poor basis for the purpose of estimating the 
average energy. 

Next, consider a decomposition of the thermal ensem- 
ble in terms of METTS. From the calculation of both 
the specific heat and the quantum specific heat shown in 
Fig. 11 for the S — 1 Heisenberg chain, we can see that. 



even near the peak where the difference between Cu and 
C^'^' is greatest, the quantum specific heat is only smaller 
than the total by about a factor of 0.8. This indicates 
that the METTS decomposition we sampled is quite close 
to being optimal. Physically, we may attribute this effi- 
ciency to the fact that, for a given temperature T, each 
METTS has a significant overlap with all of the energy 
eigenstates \ei) such that < T. 

Finally then, let us see what would happen if we use 
a decomposition of the thermal ensemble in which this 
overlap with the energy eigenstates is maximal. Consider 
the overcomplete, unnormalized basis of states 



10) = E 



(37) 



each parameterized by the set & = {^'i}- Then the nor- 
malized states given by 



10(e)) 



1 



|Q)-:^E 



e^) (38) 



form a decomposition of a thermal ensemble with uniform 
weights ^"(6) = const. If we now imagine sampling even 
just one such state, the resulting estimate of the energy 
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would be exact, since 



Z 



(39) 



In a similar manner, one may obtain the exact specific 
heat from just one of these states as well, which implies 
that C^^-* = Cy for this decomposition. Formally, then, 
this particular basis is the most optimal one for sampling 
the average energy of the system. However, constructing 
even one state |0) amounts to diagonalizing the entire 
Hamiltonian, which would render a sampling approach 
irrelevant. 

In the end, then, the METTS decomposition achieves 
both a remarkably high sampling efficiency in practice, as 
quantified by — , while also being one of the least 
costly decompositions to produce. 



C. In What Sense are METTS Minimally 
Entangled? 

Consider a system consisting of just two spin 1/2 mo- 
ments, A and B in a pure state \tp). We call |V') entangled 
if it cannot be factorized, that is, if it is not a CPS. 

To compute the entanglement entropy <S'[^/'] one first 
forms the density matrix pA describing spin A alone by 
tracing out the states of B 



PA = TTB\^){tp\ 



(40) 



One may define pB likewise. In terms of these reduced 
density matrices, S[ili\ is then 



'Tr[pA log2(py 



-Tr[pB log2(pB) 



(41) 



At finite temperature, however, A and B are not in a 
pure state |?/'), but in the mixed state described by the 
density matrix p = ^e~^^ . While for a mixed state one 
can no longer define an entanglement entropy in the same 
manner, one could choose a particular decomposition of 
p as in Eq. (33) and compute its average entanglement 
entropy 



p{i) 



(42) 



However, S^[p\ turns out to depend upon which decom- 
position we choose. 

A more meaningful measure of mixed state entangle- 
ment is the entanglement of formation E[p] defined as 
the minimum value of 5'^ [p] amongst all possible decom- 
positions^ 



E[p] 
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FIG. 12: The average entanglement of the ZZ and XZ METTS 
density matrix decompositions for the two site S — 1/2 
Heisenberg antiferromagnet. The XZ METTS form the opti- 
mal METTS decomposition of the system, but their average 
entropy exceeds the entanglement of formation E[p]. Also 
shown is the average entanglement of the energy eigenstate 
decomposition. 



While such a minimization would be difficult to com- 
pute in general, an exact expression has been found 
by Wootters for the case of a two qubit system.^ So, 
for this simple system at least, we may unambiguously 
check whether the average entanglement entropy of the 
METTS decomposition saturates the lower bound set by 
E[p]. In other words, we may explicitly check if METTS 
are minimally entangled. 

Before we make this comparison, though, let us review 
Wootters' method for computing the entanglement of for- 
mation. First define the concurrence of a pure state lip) 
of two qubits as 



cm = 1(^1^)1 



(44) 



where 1-0) is the time reversal conjugate of 1-0)- Concur- 
rence is an alternative measure of entanglement, and may 
be used to compute the standard von Neumann entan- 



glement entropy Eq. (41) through the relation 



(45) 



where f{x) = — xlog2(a;) — (1 — x) log2(l — x). 

Now for a general mixed state of two qubits, it may be 
shown that there exists at least one optimal decomposi- 



tion of p which saturates the minimum in Eq. ( 43 ) and 
which consists of pure states all having the same concur- 
rence, and hence the same entanglement entropy. This 
minimal concurrence value is given by 



C[p] = max{0, Ai - A2 - Ag - A4} 



(46) 



where the A^ are the square roots of the eigenvalues of 
(43) the matrix p(ay (E) (Ty)p*{ay ® ay). It therefore follows 
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that the entanglement of formation of a two qubit system 
is given by 



E[p] = / 



(47) 



Equipped with this expHcit formula for the entangle- 
ment of formation, let us now consider a specific sys- 
tem, namely the two site S — 1/2 Heisenberg model with 
Hamiltonian 



H = J Sa-Se 



(48) 



and take the antiferromagnetic J > case. 

As /3 — )• oo, the unique ground state |^o) of H is the 
singlet formed out the the spins A and B, which has an 
entanglement entropy 5'['0o] = 1- So, the entanglement 
of formation must also be E[p\ = 1 at zero temperature. 
In the opposite limit as /? goes to zero, E[p] must go to 
zero. As shown in Fig. [T2j it actually reaches zero at a 
finite temperature of j3J = ln(3). Thus for any smaller 
value of /3J, the density matrix is separable and may be 
decomposed entirely in terms of CPS 

Now let us compute the average entanglement of a 
METTS decomposition of this system. To do so, we must 
first choose the underlying CPS |i) to be used in produc- 
ing our METTS. One possible choice is to take the set \i) 
to be the complete basis of z eigenstates. Because this 
yields a decomposition consisting of only four METTS 
we can calculate them exactly. The resulting average 
entanglement entropy of this "ZZ METTS" ensemble is 
plotted in Fig. [12] 

Note that we could have also used the pure state al- 
gorithm to sample this decomposition by collapsing each 
METTS into the z basis on every site and then the x 
basis on every site on alternate steps. The METTS pro- 
duced after either type of collapse can then be used to 
measure the entanglement since the system is rotation- 
ally invariant. Collapsing into the z basis alone, on the 
other hand, will not sample all four METTS since the 
Hamiltonian conserves total 5^. 

We may also choose CPS bases other than the S^- 
eigenstates and thereby find METTS decompositions 
with even lower average entanglement. Consider the 
XZ METTS derived from the orthonormal CPS basis 
{| I t^)i I I where up and down arrows 

denote 5*^ eigenstates and left and right arrows denote 
eigenstates. As shown in Fig. [T2j the average entan- 
glement of these METTS arc uniformly lower than the 
ZZ METTS. 

What is more, the variance of both energy and entropy 
is exactly zero for the XZ METTS ensemble. This follows 
from the fact that any XZ METTS can be transformed 
into any other by global spin rotations that commute 
with H. Therefore we can conclude that the XZ METTS 
are the optimal METTS decomposition for this system, 
at least up to a global spin rotation. Note that one may 
also sample the XZ METTS with the pure state method 
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FIG. 13: The average entanglement entropy of extended de- 
compositions of the two site S — 1/2 Heisenberg antiferro- 
magnet. Each optimal extension reaches the minimum al- 
lowed entanglement at some temperature /3o, but still exceeds 
the entanglement of the optimal XZ METTS at high enough 
temperatures. Also shown are the entanglement of the ZZ 
METTS decomposition and the energy eigenstate decompo- 
sition. 



by collapsing them into the x and z bases on alternating 
steps and on alternating sites. 

Regardless of the basis we choose, however, one can 



see from Fig. 12 that the average entanglement of each 
METTS decomposition goes to zero smoothly as /3 — >■ 
and always exceeds E[p]. Therefore, because even the 
optimal METTS decomposition is greater than E[p] for 
any finite temperature, we can conclude that METTS are 
not minimally entangled. 

However, we may ask if the optimal METTS decompo- 
sition is instead minimally entangled in a more limited 
sense. For this purpose it will be helpful to look at a 
special class of thermal density matrix decompositions. 
Beginning from a decomposition at some particular tem- 
perature 



e 



Pa = 



Zo 



(49) 



one can generate a decomposition for p at a different 
temperature /3 by making use of the relation 



(50) 



where the pure states defining the new decomposition are 
given by 



|V'.(/3))=e-('5-^''W2|V;.)/g(z)i/^ 



(51) 
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FIG. 14: The average entanglement entropy of extended de- 
compositions of the two site S — 1/2 Heisenberg antiferro- 
magnet with an appUed field /i = 1.5 J. In the presence of 
the field, the optimal METTS decomposition is given by the 
XY METTS. Also shown are the entanglement of the XZ and 
ZZ METTS decompositions and the energy eigenstate decom- 
position. 



We may consider this new decomposition the extension 
of the decomposition of Eq. (49), or an extended decom- 
position for short. 



Note that the entanglement of an 
extended decomposition is always an analytic function of 
/3. For our two qubit system then, this implies that no 
extended decomposition can have an entanglement equal 
to E[p\ at all temperatures. 

In fact, we would like to conjucture that no extended 
decomposition can even have an entanglement below the 
optimal METTS decomposition at all temperatures. To 
see why this may be the case, let us attempt to construct 
a counterexample for the two qubit system. At any tem- 
perature /3o, we can construct an optimal decomposition, 
following Wootters, such that the entropy of each state 
IV'i) is equal to the entanglement of formation. Then, 
because the extensions of these decompositions will have 
the least possible entanglement at /3q, we expect that they 
will also have low entanglement at other temperatures. 

Now, there can be multiple optimal decompositions for 
each starting temperature /3o- However, we have found 
that there is always one whose extension is less entan- 
gled than the others and has zero entropy variance for 



all temperatures, not just Po- As shown in Fig. [13) while 
such optimal extensions have less entanglement than the 
optimal METTS at low temperatures, they eventually 
exceed the optimal METTS as the temperature is in- 
creased. Moreover, the less entangled these extensions 
become at high temperature, the more they resemble the 
optimal METTS ensemble itself. 

To check that this behavior is general, we can also 
study the two qubit system in a field with Hamiltonian 



Here we focus on the case J > and h > 0. For small 
fields, the temperature dependence of the entanglement 
of formation E[p] remains similar to the zero field case. 
Then at h — J, the system undergoes a first-order quan- 
tum phase transition such that the ground state is no 
longer a maximally entangled singlet, but the fully polar- 
ized CPS with zero entanglement. And once h is greater 
than J, the entanglement of formation no longer behaves 
as a monotonic function of temperature. Regardless of 
the value of h, though, E[p] is strictly zero for f3J < ln(3) 
indicating that p becomes separable above this tempera- 
ture. 



H = J Sa- Sb -h{S% + S%) 



(52) 



In Fig. 14 we have taken a particular value of the field 
h > J and plotted the entropies of various optimal exten- 
sions. As before, each extension either exceeds the opti- 
mal METTS entanglement over a significant temperature 
range or else approaches the optimal METTS decompo- 
sition ever more closely as this range is reduced. Also, it 
is interesting to note that in the presence of a field it is 
the XY METTS (generated from S'' and S'f eigenstates) 
which now form the optimal METTS decomposition, at 
least up to a rotation about the z axis. This is because 
the XZ METTS are no longer required to be physically 
identical by the symmetries of the Hamiltonian and will 
exhibit a finite entropy variance. 

Going forward, it will be interesting to explore the ex- 
tent to which the observations made here hold for larger 
systems and more complex models. It would be especially 
useful to know if an optimal METTS decomposition with 
ideal sampling efficiency always exists and whether it can 
be efficiently computed. 

And although the average entanglement entropy of a 
METTS decomposition is not an entanglement measure 
in a strict sense (for instance, it is not zero when p is sep- 
arable),!^ it is conceivable that its scaling behavior could 
reveal non-trivial information about a system. Whether 
or not one must find an optimal METTS decomposition 
to access such information remains to be seen. But, be- 
cause the METTS pure state algorithm is efficient enough 
to simulate moderately large two-dimensional systems, 
the entanglement properties of METTS could turn out 
to be a useful tool in characterizing exotic phases or crit- 
ical points. 
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Appendix: Working with Matrix Product States and 
Matrix Product Operators 

A matrix product state (MPS) is a scheme for repre- 
senting a wavefunction, either exactly or approximately, 
by writing its amplitudes in some particular basis as a 
product of matrices. A general MPS may be written as 



A' 



\SN) 



(53) 



where the site labels Si take on values Si = 1,2, ... ,d 
labeling the local basis. 

For an adjacent pair of matrices , if the in- 

dex v summed between the matrices takes on values 
1,2, ... ,m one says that the bond between matrices i 
and i + 1 has bond dimension m. Though m can vary 
from bond to bond, in the following discussion we assume 
that it has a fixed value for a given MPS. And, while in 
the presence of periodic boundary conditions one must 
trace over the above product of matrices in order to re- 
duce it to a scalar amplitude, in what follows we work 
exclusively with open boundary conditions such that A"^ 
and A*™ are just 1 x m and m x 1 matrices, respectively. 

A matrix product operator (MPO) is defined similarly 
to an MPS except that the decomposition is over a basis 
of local operators. A general MPO may therefore be 
written as 

W=J2 W'^'^ ■■■W'"''-\h)...\tN){si\...{sN\ (54) 

{t,s} 

where for open boundary conditions the dimensions of 
the first and last matrix are again 1 x to and to x 1. 



1. Creating MPOs 

The simplest MPOs to create are those describing 
products of local operators, that is = HjOj. In partic- 
ular, one can represent a single site operator Oi this way 
by taking Oj = 1 for all j ^ i. If the operator W is such 
a site-factorizcd product, its MPO representation has a 
simple product structure as well. Each is actually 

just a 1 X 1 matrix defined by 11/*'"' = {ti\Oi\si). 

Another simple class of MPOs are those describing 
translationally invariant sums of operators in one dimen- 
sion in which case the matrices W can be taken to be 
identical (except at the boundaries) and of lower tri- 
angular form. For example, the MPO of the operator 
^tat = Si^i '^i t>e represented in terms of the bulk 
matrices 



together with the boundary matrices H^*!"! — [0 J'^^^] 
and W*""" = [(5*""" 0]^. Translationally invariant one 
dimensional Hamiltonians can likewise be represented by 



translationally invariant lower triangular MPOs of bond 
dimension 3 and higher.^"* 

However, there may not be always such a simple, ex- 
plicit prescription for operators of interest, in particular 
Hamiltonians of two dimensional systems. For use in 
such cases, we present below general algorithms for the 
addition and multiplication of arbitrary MPOs. 

Equipped with these algorithms, the MPO representa- 
tion of any Hamiltonian that is a sum of local terms may 
be found by first obtaining the MPOs for each term as 
described above and then summing them together. The 
MPO multiplication algorithm is then useful for forming 
other important operators such as for specific heat 
measurements or e~'^^ for imaginary time evolution. 



2. Adding MPS and MPOs 

First consider adding two MPS |Vm) and ji/'s) with 
bond dimensions uia and niB- If IV'c) = \4'a) + \iPb), 
then 



C = A'^ © B'^ . 
which in block form is 



A"' 
B'- 



(55) 



(56) 



More formally, one may write this definition of C"*' as 



C. 



Xi-lXi 



where the x indices run over values 1,2, ... , {niA + mB). 

After constructing the C matrices as above, the addi- 
tion has been carried out exactly. However, one usually 
wants to work with an MPS for \iIjc) that has a bond 
dimension comparable to ruA or rriB, not tua + mB. One 
therefore usually truncates the MPS representation for 
iV'c) as follows. 

Treating Ci = as a matrix Cs-^xi i compute its SVD 



(58) 



During this step, only mc singular values are kept such 
that x'l — 1; 2, ... , mc- Finally, replace Ci with C^] — 

Xi 

Us,y' and replace Ca with C"? = A^. v\ C?.\ . 

"iXl ^ XiX2 Xl XiXl XlX2 

The next bond in the truncated MPS for \tjjc) is now 
found by treating the new C2 as a matrix C'(s2xi)x2 ^^'^ 
computing its SVD. The matrix C2 is then set to U2 and 
C3 is multiplied by A2V2 just as C2 was in the first step. 
This truncation process may then be likewise repeated 
for every remaining C matrix. 
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The addition of MPOs proceeds in complete analogy 
to the addition of MPS. To compute the MPO Z such 
that Z ^ W + X , one defines the matrices Z as 



(59) 



Now, to carry out the multiplication beginning with 
the first site, define the tensor i?i as 



(60) 



where the only difference from the MPS case is that there 
are now cP matrices to be added instead of d. Thinking of 
the pair of indices Si and ij as a fat index (siti), one may 
likewise truncate the matrices Z by performing sequential 
SVDs as above. 

The MPS addition algorithm above has a computa- 
tional cost of Nm^cP for = uib = rnc = rn. Simi- 
larly, the addition of MPOs scales as Nk^d'^ where k is 
the MPO bond dimension. 



3. Multiplying MPOs 

Consider the task of multiplying two MPOs W and X 
with the goal of producing a product MPO Z = WX. 
Let these MPOs be written as products of W, X and Z 
matrices with bond dimension k. 

This multiplication may be carried out one site at a 
time in a process analogous to the zip-up algorithm dis- 
cussed above. Therefore, the first step is to map each 
MPO to an MPS by combining its t and s (or primed 
and unprimed) indices at each site into a fat index. These 
MPS should then be transformed such that their orthog- 
onality center is at the first site, and then mapped back 
into MPOs. In doing so, one ensures that in the trunca- 
tion step below, the basis of states surrounding the local 
tensor is not overly ill-conditioned. However, because 
this basis will not be orthonormal, it is important to use 
a truncation error cutoff instead of a fixed k to determine 
the final MPO bond dimension. 



Thinking of i?i as a matrix R(^siti){LJi^i): compute its 
SVD to obtain 



R 



(61) 



where the singular values are truncated such that only 
the k largest are kept. The first matrix of Z may now be 
defined as Z^i'^ = U,^s^t^Ki- 

The remaining Z matrices may now be found by pro- 
ceeding recursively. Define i?2 in terms of Ri as 



■Cl"2?2 



^Cl(sitl) 



E 



€i52 



(62) 

Again, i?2 may be thought of as a matrix and its SVD 
computed as 



-^(s2t2Cl)('^2«2) - ^(s2«2Cl)C2^C2"^cl(a;2«2) 



(63) 



The second matrix of Z may now be defined as 
^Cicl ~ ^(s2t2Ci)C2 algorithm continued by defin- 

ing i?3 analagously to i?2 above in terms of U2, R2, W3 
and X3. 



If the contractions in Eq. ( 62 ) are carried out in the or- 



der indicated, this algorithm for multiplying MPOs scales 
as Nk^d^ with the most expensive step being the con- 
struction of the R tensors. 
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